Edge-aware baselines for ogbn-proteins in PyTorch Geometric: species-wise normalization, post-hoc calibration, and cost-accuracy trade-offs
Stanković, Aleksandar, Lisica, Dejan
We present reproducible, edge-aware baselines for ogbn-proteins in PyTorch Geometric (PyG). We study two system choices that dominate practice: (i) how 8-dimensional edge evidence is aggregated into node inputs, and (ii) how edges are used inside message passing. Our strongest baseline is GraphSAGE with sum-based edge-to-node features. We compare LayerNorm (LN), BatchNorm (BN), and a species-aware Conditional LayerNorm (CLN), and report compute cost (time, VRAM, parameters) together with accuracy (ROC-AUC) and decision quality. In our primary experimental setup (hidden size 512, 3 layers, 3 seeds), sum consistently beats mean and max; BN attains the best AUC, while CLN matches the AUC frontier with better thresholded F1. Finally, post-hoc per-label temperature scaling plus per-label thresholds substantially improves micro-F1 and expected calibration error (ECE) with negligible AUC change, and light label-correlation smoothing yields small additional gains. We release standardized artifacts and scripts used for all of the runs presented in the paper.
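As a minimal illustration of the sum-based edge-to-node aggregation described above, the following sketch accumulates per-edge features onto target nodes. It uses NumPy in place of PyG's scatter operations for clarity; the function and variable names are ours, not the paper's.

```python
import numpy as np

def edge_to_node_features(edge_index, edge_attr, num_nodes):
    """Aggregate per-edge features into per-node inputs by summation.

    edge_index: (2, E) array of [source, target] node ids
    edge_attr:  (E, F) array of edge features (F = 8 for ogbn-proteins)
    Returns a (num_nodes, F) array where each row is the sum of the
    features of all edges pointing at that node.
    """
    num_feats = edge_attr.shape[1]
    x = np.zeros((num_nodes, num_feats))
    # scatter-add: accumulate each edge's features onto its target node
    np.add.at(x, edge_index[1], edge_attr)
    return x

# toy example: 3 nodes, 4 edges with 2-dim edge features
edge_index = np.array([[0, 1, 2, 0],
                       [1, 2, 1, 2]])
edge_attr = np.array([[1.0, 0.5],
                      [0.2, 0.1],
                      [0.3, 0.3],
                      [0.4, 0.6]])
x = edge_to_node_features(edge_index, edge_attr, num_nodes=3)
```

Swapping `np.add.at` for a mean or max over incoming edges gives the competing aggregations compared in the paper.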
Variable Selection Using Relative Importance Rankings
Although conceptually related, variable selection and relative importance (RI) analysis have been treated quite differently in the literature. While RI is typically used for post-hoc model explanation, this paper explores its potential for variable ranking and filter-based selection before model creation. Specifically, we anticipate strong performance from the RI measures because they incorporate both direct and combined effects of predictors, addressing a key limitation of marginal correlation that ignores dependencies among predictors. We implement and evaluate the RI-based variable selection methods using general dominance (GD), comprehensive relative importance (CRI), and a newly proposed, computationally efficient variant termed CRI.Z. We first demonstrate how the RI measures more accurately rank the variables than the marginal correlation, especially when there are suppressed or weak predictors. We then show that predictive models built on these rankings are highly competitive, often outperforming state-of-the-art methods such as the lasso and relaxed lasso. The proposed RI-based methods are particularly effective in challenging cases involving clusters of highly correlated predictors, a setting known to cause failures in many benchmark methods. Although lasso methods have dominated the recent literature on variable selection, our study reveals that the RI-based method is a powerful and competitive alternative. We believe these underutilized tools deserve greater attention in statistics and machine learning communities. The code is available at: https://github.com/tien-endotchang/RI-variable-selection.
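For intuition, the general dominance (GD) measure can be sketched for small p as a Shapley-style decomposition of R-squared: each predictor's marginal contribution to the model R-squared, averaged over all subsets of the other predictors. This brute-force version is ours and is only meant to show how GD captures combined rather than purely marginal effects; the paper's implementation may differ.

```python
import itertools
import numpy as np

def r_squared(X, y):
    """In-sample R^2 of an OLS fit of y on X (with intercept)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    tss = (y - y.mean()) @ (y - y.mean())
    return 1.0 - (resid @ resid) / tss

def general_dominance(X, y):
    """General dominance: average marginal contribution of each predictor
    to R^2 across all subsets of the remaining predictors (a Shapley
    decomposition of R^2; brute force, feasible only for small p)."""
    p = X.shape[1]
    gd = np.zeros(p)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        size_means = []
        for r in range(p):  # subset sizes 0 .. p-1
            deltas = []
            for S in itertools.combinations(others, r):
                base = r_squared(X[:, list(S)], y) if S else 0.0
                deltas.append(r_squared(X[:, list(S) + [j]], y) - base)
            size_means.append(np.mean(deltas))
        gd[j] = np.mean(size_means)
    return gd

# toy data: x0 is the strongest predictor, x2 is pure noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2 * X[:, 0] + X[:, 1] + 0.5 * rng.normal(size=200)
gd = general_dominance(X, y)
```

A convenient sanity check is that the GD weights sum exactly to the full-model R-squared, so they partition explained variance across predictors.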
Mallows Model with Learned Distance Metrics: Sampling and Maximum Likelihood Estimation
Alimohammadi, Yeganeh, Asgari, Kiana
The \textit{Mallows model} is a widely-used probabilistic framework for learning from ranking data, with applications ranging from recommendation systems and voting to aligning language models with human preferences~\cite{chen2024mallows, kleinberg2021algorithmic, rafailov2024direct}. Under this model, observed rankings are noisy perturbations of a central ranking $σ$, with likelihood decaying exponentially in distance from $σ$, i.e., $P(π) \propto \exp\big(-β\cdot d(π, σ)\big),$ where $β > 0$ controls dispersion and $d$ is a distance function. Existing methods mainly focus on fixed distances (such as Kendall's $τ$ distance), with no principled approach to learning the distance metric directly from data. In practice, however, rankings naturally vary by context; for instance, in some sports we regularly see long-range swaps (a low-ranked team beating a high-ranked one), while in others such events are rare. Motivated by this, we propose a generalization of the Mallows model that learns the distance metric directly from data. Specifically, we focus on $L_α$ distances: $d_α(π,σ):=\sum_{i=1}^{n} |π(i)-σ(i)|^α$. For any $α\geq 1$ and $β>0$, we develop a Fully Polynomial-Time Approximation Scheme (FPTAS) to efficiently generate samples that are $ε$-close (in total variation distance) to the true distribution. Even in the special cases of $L_1$ and $L_2$, this generalizes prior results that required vanishing dispersion ($β\to0$). Using this sampling algorithm, we propose an efficient Maximum Likelihood Estimation (MLE) algorithm that jointly estimates the central ranking, the dispersion parameter, and the optimal distance metric. We prove strong consistency results for our estimators (for any values of $α$ and $β$), and we validate our approach empirically using datasets from sports rankings.
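For intuition, the Mallows distribution under an $L_α$ distance can be computed exactly for tiny $n$ by enumerating all permutations; the paper's FPTAS is what makes realistic $n$ tractable. This brute-force sketch is ours, purely illustrative.

```python
import itertools
import math

def mallows_pmf(n, sigma, beta, alpha):
    """Exact Mallows pmf over S_n under the L_alpha distance
    d_alpha(pi, sigma) = sum_i |pi(i) - sigma(i)|^alpha.
    Brute force over all n! permutations: feasible only for tiny n."""
    def dist(pi):
        return sum(abs(pi[i] - sigma[i]) ** alpha for i in range(n))
    perms = list(itertools.permutations(range(n)))
    weights = [math.exp(-beta * dist(pi)) for pi in perms]
    Z = sum(weights)  # normalizing constant
    return {pi: w / Z for pi, w in zip(perms, weights)}

# central ranking (0, 1, 2, 3), dispersion beta = 0.5, L_1 distance
pmf = mallows_pmf(4, sigma=(0, 1, 2, 3), beta=0.5, alpha=1)
```

Sampling from this exact pmf (e.g. with `random.choices` over its keys) gives a reference distribution against which an approximate sampler's total variation distance can be checked on small instances.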
A closer look at how large language models trust humans: patterns and biases
As large language models (LLMs) and LLM-based agents increasingly interact with humans in decision-making contexts, understanding the trust dynamics between humans and AI agents becomes a central concern. While considerable literature studies how humans trust AI agents, it is much less understood how LLM-based agents develop effective trust in humans. LLM-based agents likely rely on some form of implicit effective trust in trust-related contexts (e.g., evaluating individual loan applications) to assist and affect decision making. Using established behavioral theories, we develop an approach that studies whether LLM trust depends on the three major trustworthiness dimensions of the human subject: competence, benevolence, and integrity. We also study how demographic variables affect effective trust. Across 43,200 simulated experiments spanning five popular language models and five different scenarios, we find that LLM trust development shows an overall similarity to human trust development. We find that in most, but not all, cases, LLM trust is strongly predicted by trustworthiness, and in some cases is also biased by age, religion, and gender, especially in financial scenarios. This is particularly true for scenarios common in the literature and for newer models. While the overall patterns align with human-like mechanisms of effective trust formation, different models exhibit variation in how they estimate trust; in some cases, trustworthiness and demographic factors are weak predictors of effective trust. These findings call for a better understanding of AI-to-human trust dynamics and monitoring of biases and trust development patterns to prevent unintended and potentially harmful outcomes in trust-sensitive applications of AI.
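The analysis design described above amounts to regressing an elicited trust score on the trustworthiness dimensions plus demographic indicators. A hypothetical sketch follows; all data and coefficients here are synthetic illustrations, not the paper's results.

```python
import numpy as np

# Synthetic illustration: regress a simulated trust score on the three
# trustworthiness dimensions plus one toy demographic indicator.
rng = np.random.default_rng(42)
n = 500
competence = rng.uniform(1, 7, n)   # 7-point trustworthiness scales
benevolence = rng.uniform(1, 7, n)
integrity = rng.uniform(1, 7, n)
is_older = rng.integers(0, 2, n)    # invented demographic dummy

# generating process chosen arbitrarily for the illustration
trust = (0.6 * competence + 0.3 * benevolence + 0.4 * integrity
         - 0.2 * is_older + rng.normal(0, 0.5, n))

# OLS fit: intercept + four predictors
X = np.column_stack([np.ones(n), competence, benevolence, integrity, is_older])
coef, *_ = np.linalg.lstsq(X, trust, rcond=None)
```

In this framing, a nonzero coefficient on the demographic dummy after controlling for the three trustworthiness dimensions is the signature of bias the paper looks for.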
Assessing Consistency and Reproducibility in the Outputs of Large Language Models: Evidence Across Diverse Finance and Accounting Tasks
Wang, Julian Junyan, Wang, Victor Xiaoqi
This study provides the first comprehensive assessment of consistency and reproducibility in Large Language Model (LLM) outputs in finance and accounting research. We evaluate how consistently LLMs produce outputs given identical inputs through extensive experimentation with 50 independent runs across five common tasks: classification, sentiment analysis, summarization, text generation, and prediction. Using three OpenAI models (GPT-3.5-turbo, GPT-4o-mini, and GPT-4o), we generate over 3.4 million outputs from diverse financial source texts and data, covering MD&As, FOMC statements, finance news articles, earnings call transcripts, and financial statements. Our findings reveal substantial but task-dependent consistency, with binary classification and sentiment analysis achieving near-perfect reproducibility, while complex tasks show greater variability. More advanced models do not uniformly demonstrate better consistency and reproducibility; instead, task-specific patterns emerge. LLMs significantly outperform expert human annotators in consistency and maintain high agreement even where human experts significantly disagree. We further find that simple aggregation strategies across 3-5 runs dramatically improve consistency. Simulation analysis reveals that despite measurable inconsistency in LLM outputs, downstream statistical inferences remain remarkably robust. These findings address concerns about what we term "G-hacking," the selective reporting of favorable outcomes from multiple Generative AI runs, by demonstrating that such risks are relatively low for finance and accounting tasks.
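Run-to-run consistency and the majority-vote aggregation strategy can be sketched as follows. The metric here (fraction of runs agreeing with the modal output per item, averaged over items) is one natural choice and may not match the paper's exact definitions.

```python
from collections import Counter

def run_consistency(runs):
    """Average, over items, of the fraction of runs that agree with the
    modal output for that item. runs: list of per-run output lists."""
    n_runs = len(runs)
    per_item = []
    for outputs in zip(*runs):  # iterate items across runs
        modal_count = Counter(outputs).most_common(1)[0][1]
        per_item.append(modal_count / n_runs)
    return sum(per_item) / len(per_item)

def majority_vote(runs):
    """Aggregate several runs into one output per item by majority vote."""
    return [Counter(outputs).most_common(1)[0][0] for outputs in zip(*runs)]

# toy example: 3 runs of a sentiment classifier over 3 documents
runs = [["pos", "neg", "pos"],
        ["pos", "neg", "neg"],
        ["pos", "neg", "pos"]]
consistency = run_consistency(runs)
aggregated = majority_vote(runs)
```

The majority-vote output is at least as stable as any single run, which is the mechanism behind the paper's finding that aggregating 3-5 runs dramatically improves consistency.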
New Test-Time Scenario for Biosignal: Concept and Its Approach
Jo, Yong-Yeon, Lee, Byeong Tak, Kim, Beom Joon, Hong, Jeong-Ho, Lee, Hak Seung, Kwon, Joon-myoung
Online Test-Time Adaptation (OTTA) enhances model robustness by updating pre-trained models with unlabeled data during testing. In healthcare, OTTA is vital for real-time tasks like predicting blood pressure from biosignals, which demand continuous adaptation. We introduce a new test-time scenario with streams of unlabeled samples and occasional labeled samples. Our framework combines supervised and self-supervised learning, employing a dual-queue buffer and weighted batch sampling to balance data types. Experiments show improved accuracy and adaptability under real-world conditions.
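A minimal sketch of the dual-queue buffer with weighted batch sampling: one bounded queue for the rare labeled samples, one for the unlabeled stream, and batches drawn from both according to a mixing weight. Class name, capacity, and the labeled/unlabeled ratio are illustrative assumptions, not the paper's settings.

```python
import random
from collections import deque

class DualQueueBuffer:
    """Test-time buffer holding labeled and unlabeled samples separately,
    with weighted sampling to balance the two data types in each batch."""

    def __init__(self, capacity=256, labeled_weight=0.5):
        self.labeled = deque(maxlen=capacity)    # (sample, label) pairs
        self.unlabeled = deque(maxlen=capacity)  # samples only
        self.labeled_weight = labeled_weight     # target labeled fraction

    def add(self, sample, label=None):
        if label is None:
            self.unlabeled.append(sample)
        else:
            self.labeled.append((sample, label))

    def sample_batch(self, batch_size):
        """Draw a batch mixing labeled and unlabeled samples; falls back
        to whatever is available when one queue is nearly empty."""
        n_lab = min(int(round(batch_size * self.labeled_weight)),
                    len(self.labeled))
        n_unlab = min(batch_size - n_lab, len(self.unlabeled))
        lab = random.sample(list(self.labeled), n_lab)
        unlab = random.sample(list(self.unlabeled), n_unlab)
        return lab, unlab

buf = DualQueueBuffer(capacity=4, labeled_weight=0.5)
for i in range(10):
    buf.add(f"x{i}")             # unlabeled stream (old items evicted)
buf.add("x_lab", label=1.0)      # occasional labeled sample
lab, unlab = buf.sample_batch(4)
```

The bounded `deque(maxlen=...)` gives first-in-first-out eviction, so the buffer tracks the most recent portion of the stream, which is what continual adaptation needs.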
Large Language Models Assume People are More Rational than We Really are
Liu, Ryan, Geng, Jiayi, Peterson, Joshua C., Sucholutsky, Ilia, Griffiths, Thomas L.
In order for AI systems to communicate effectively with people, they must understand how we make decisions. However, people's decisions are not always rational, so the implicit internal models of human decision-making in Large Language Models (LLMs) must account for this. Previous empirical evidence seems to suggest that these implicit models are accurate -- LLMs offer believable proxies of human behavior, acting as we expect humans would in everyday interactions. However, by comparing LLM behavior and predictions to a large dataset of human decisions, we find that this is actually not the case: when both simulating and predicting people's choices, a suite of cutting-edge LLMs (GPT-4o & 4-Turbo, Llama-3-8B & 70B, Claude 3 Opus) assume that people are more rational than we really are. Specifically, these models deviate from human behavior and align more closely with a classic model of rational choice -- expected value theory. Interestingly, people also tend to assume that other people are rational when interpreting their behavior. As a consequence, when we compare the inferences that LLMs and people draw from the decisions of others using another psychological dataset, we find that these inferences are highly correlated. Thus, the implicit decision-making models of LLMs appear to be aligned with the human expectation that other people will act rationally, rather than with how people actually act.
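The expected-value-theory baseline that the LLMs gravitate toward is simple to state in code: pick the gamble with the highest probability-weighted payoff. This sketch is ours, not the paper's implementation; the contrast in the comment is the standard risk-aversion finding, not a result from the paper.

```python
def expected_value(gamble):
    """Expected value of a gamble given as [(payoff, probability), ...]."""
    return sum(payoff * prob for payoff, prob in gamble)

def rational_choice(gambles):
    """Index of the option an expected-value maximizer would pick."""
    return max(range(len(gambles)), key=lambda i: expected_value(gambles[i]))

# A risky option vs. a sure thing: EV theory picks the risky gamble here,
# whereas humans, typically risk-averse for gains, often take the sure $45.
risky = [(100.0, 0.5), (0.0, 0.5)]   # EV = 50
sure = [(45.0, 1.0)]                 # EV = 45
choice = rational_choice([risky, sure])
```

Comparing such EV-maximizing choices against logged human choices on the same gambles is the kind of behavioral benchmark the paper's comparison rests on.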
Using Artificial Intelligence to Unlock Crowdfunding Success for Small Businesses
Ye, Teng, Zheng, Jingnan, Jin, Junhui, Qiu, Jingyi, Ai, Wei, Mei, Qiaozhu
While small businesses are increasingly turning to online crowdfunding platforms for essential funding, over 40% of these campaigns may fail to raise any money, especially those from low socio-economic areas. We utilize the latest advancements in AI technology to identify crucial factors that influence the success of crowdfunding campaigns and to improve their fundraising outcomes by strategically optimizing these factors. Our best-performing machine learning model accurately predicts the fundraising outcomes of 81.0% of campaigns, primarily based on their textual descriptions. Interpreting the machine learning model allows us to provide actionable suggestions on improving the textual description before launching a campaign. We demonstrate that by augmenting just three aspects of the narrative using a large language model, a campaign becomes preferable to 83% of human evaluators, and its likelihood of securing financial support increases by 11.9%. Our research uncovers effective strategies for crafting descriptions for small business fundraising campaigns and opens up a new realm in integrating large language models into crowdfunding methodologies.
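A toy sketch of predicting fundraising outcomes from textual descriptions: bag-of-words features feeding a small logistic regression trained by gradient descent. The paper's best-performing model is far more sophisticated; the corpus, vocabulary, and labels below are invented for illustration.

```python
import numpy as np

def bow_features(texts, vocab):
    """Bag-of-words counts for a fixed vocabulary."""
    X = np.zeros((len(texts), len(vocab)))
    for i, text in enumerate(texts):
        words = text.lower().split()
        for j, word in enumerate(vocab):
            X[i, j] = words.count(word)
    return X

def train_logreg(X, y, lr=0.5, steps=2000):
    """Plain gradient-descent logistic regression (with intercept)."""
    A = np.column_stack([np.ones(len(y)), X])
    w = np.zeros(A.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-A @ w))       # predicted probabilities
        w -= lr * A.T @ (p - y) / len(y)       # mean log-loss gradient
    return w

def predict(w, X):
    A = np.column_stack([np.ones(X.shape[0]), X])
    return (1.0 / (1.0 + np.exp(-A @ w)) > 0.5).astype(int)

# invented toy corpus: campaigns mentioning a concrete budget "succeed"
texts = ["clear plan with budget", "no plan", "detailed budget and plan",
         "please help", "budget plan attached", "help us"]
y = np.array([1, 0, 1, 0, 1, 0])
vocab = ["plan", "budget", "help"]
X = bow_features(texts, vocab)
w = train_logreg(X, y)
```

Inspecting the learned weights (here, the weight on "budget") is a crude stand-in for the model interpretation step the paper uses to generate actionable suggestions.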